An evaluation system for use with ESEA Title I programs has been
developed by RMC Research Corporation of Mountain View, California,
under contract with USOE. The system presently addresses cognitive
achievement impact using three statistical designs, each of which
may be implemented using either norm-referenced or nonnormed tests.
The reading and mathematics components of the District of Columbia
Title I program were evaluated using both types of tests and two of
the three models. The third analysis design was initially considered
for implementation, but serious violations of its requirements by
the data disqualified that model from use with this year's District
of Columbia data. This paper presents a description of the three
models and the results derived from implementing two of the three.
Additionally, a differential growth rate associated with
development, as evidenced in the norms tables of various currently
used instruments, is discussed, and potential areas of further
research are highlighted.

An Empirical Examination of Three Models
for Estimating the Effects of No-Treatment



The United States Office of Education (USOE) contracted with
RMC Research Corporation of Mountain View, California, three years
ago to develop an evaluation and reporting system for nationwide
use with ESEA Title I programs. The resulting evaluation
package consists of three statistical models, each of which may
be implemented using either norm-referenced or criterion-referenced
tests (NRTs or CRTs, respectively). These models
address the cognitive impact of Title I programs as measured
by achievement gains.

The metric used to assess program impact in this system is
the normal curve equivalent (NCE). This metric is a normalized
standard score which has been linearly transformed to match the
percentile rank scale at the 1st, 50th, and 99th percentile
points. The NCE scale is simply a standard score scale which,
for ease of interpretation, may be viewed as an equal interval
percentile scale. NCEs have a range of 1-99, a mean of 50, and a
standard deviation of 21.06. One advantage of NCEs is that, due
to their equal interval characteristic, ordinary arithmetic
operations such as averaging may be performed on them. Another is
that gain scores are easily computed, whereas grade equivalents and
percentiles, which are not equal interval, do not lend themselves
so easily to gain score analysis. A further inducement to use NCEs
is that, in the near future, USOE will probably recommend that they
become part of the evaluation system. At the same time, one
drawback of the NCE score is that it can easily be misinterpreted
as a percentile score, and vice versa; this type of
misunderstanding invites both improper interpretation and improper
manipulation of NCEs and percentiles alike.
Figure 1 illustrates the relationship between NCEs, percentiles,
stanines, and Z-scores. A more complete discussion of these
interrelationships can be found in Chiang and Rosen, 1970.
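
The conversion from percentile rank to NCE described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used in the evaluation: the function name `percentile_to_nce` is hypothetical, and the closed form NCE = 50 + 21.06z, where z is the normal deviate corresponding to the percentile, is inferred from the stated mean and standard deviation of the scale.

```python
from statistics import NormalDist

NCE_MEAN = 50.0
NCE_SD = 21.06  # standard deviation of the NCE scale, as stated in the text


def percentile_to_nce(percentile: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to an NCE.

    Assumption: the NCE is the normal deviate for the percentile,
    rescaled to a mean of 50 and a standard deviation of 21.06.
    """
    if not 0.0 < percentile < 100.0:
        raise ValueError("percentile must lie strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile / 100.0)  # normalized standard score
    return NCE_MEAN + NCE_SD * z


# The two scales should agree (to rounding) at the 1st, 50th, and 99th points.
for p in (1, 50, 99):
    print(p, round(percentile_to_nce(p), 1))
```

Between those three anchor points the two scales diverge, which is why averaging percentiles and averaging NCEs can give different answers.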

Figure 1

Before discussing the cognitive achievement evaluation
results, a brief description of the District of Columbia Title I
program is necessary. The District of Columbia Title I program
served approximately 17,000 students in grades K-3 and 7
during the 1975-76 school year. They attended Title I eligible
schools and fell below the fiftieth percentile on the Comprehensive
Test of Basic Skills (CTBS), Form S, using the level of this
instrument appropriate for the students' respective grade levels.
National norms were used for kindergarten and first grade selection, while
local norms were implemented in the three upper grade levels.
Both reading and mathematics were emphasized by the Title I program.
Participating students were exposed to supplementary instructional
strategies, both in their regular classrooms and in special
resource laboratories. A primary objective of the District of
Columbia Title I program is to effect significantly enhanced levels
of achievement in both reading and mathematics.



To assess the impact on cognitive achievement, the CTBS/S was
given to the Title I students in the fall and spring. An existing
districtwide testing program additionally supplied spring
criterion-referenced test (CRT) scores for Title I students, both for the
1974-75 and the 1975-76 school years. The CRTs used were the
Prescriptive Mathematics Test (PMT) and the Prescriptive Reading
Test (PRT). These CRT scores enabled a spring-spring analysis to
be performed, in addition to the fall-spring analysis using the
CTBS/S scores.



Program Impact in Reading and Mathematics 

In terms of evaluation, at least two types of information
are needed to determine whether a Title I project has resulted in
improved student performance. The first involves an assessment of
how the project students performed on outcome measures such as
reading comprehension and mathematical computation after
participating in the Title I project. The second requires an estimate of
expected student accomplishment had the students not participated
in the program. If the observed accomplishment of project students
exceeds their expected performance, and if the difference is both
statistically significant (manifesting a greater difference than
can be attributed to chance fluctuation in the scores) and
practically relevant (large enough to be educationally meaningful),
then the Title I project is considered to be educationally salient.



It is a relatively straightforward procedure to calculate how
well the project students performed on the outcome measures, but
it is considerably more difficult to estimate how well the project
students would have performed with no treatment. Several approaches
are available for assessing "no-treatment performance," or what the student
would have achieved had there been no special project. This next
section presents the results of two such approaches to estimating
program impact.



Earlier it was stated that a primary objective of the
Washington, D.C. Title I program is to improve reading and
mathematics achievement among participating students to an extent
that is statistically and educationally significant. Within this
framework, treatment effect is the observed posttest performance
minus the expected no-treatment posttest performance. Thus,

    Treatment effect = Observed posttest performance
                       - Expected no-treatment posttest performance

The observed posttreatment performance is simply the mean posttest
score of the Title I students on either the CTBS/S or the PRT
and PMT. The no-treatment expectation is derived using two
of the three models in an effort to converge on a valid estimate
of impact (Bessey, Rosen, Chiang, and Tallmadge, 1976).



Norm-Referenced Model Results

With the norm-referenced model, the impact of the Title I
program was estimated as follows. The pretest percentiles of each
student within the treatment group were converted to NCEs and
averaged. A similar procedure was followed for posttest scores.
Finally, the average pre and post NCE values were compared under
the assumption that, without the Title I program, the treatment
group would maintain its standing relative to the norm group.
Stated another way, the pretest and posttest mean NCE scores
should have been similar if the project had had no impact.
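
The steps of the norm-referenced model can be sketched as follows. This is a minimal illustration, not the evaluators' own program: the function names and the percentile values are hypothetical, and the percentile-to-NCE conversion assumes the closed form 50 + 21.06z.

```python
from statistics import NormalDist, mean

NCE_SD = 21.06  # standard deviation of the NCE scale, as stated in the text


def percentile_to_nce(percentile):
    """Assumed conversion: normal deviate rescaled to mean 50, SD 21.06."""
    return 50.0 + NCE_SD * NormalDist().inv_cdf(percentile / 100.0)


def norm_referenced_gain(pre_percentiles, post_percentiles):
    """Return (mean pretest NCE, mean posttest NCE, gain).

    Both lists must describe the same students in the same order, so only
    students having both a pretest and a posttest score are included.
    """
    if len(pre_percentiles) != len(post_percentiles):
        raise ValueError("each student needs both a pretest and a posttest")
    pre_mean = mean(percentile_to_nce(p) for p in pre_percentiles)
    post_mean = mean(percentile_to_nce(p) for p in post_percentiles)
    # Under the model, the no-treatment expectation is the pretest mean NCE,
    # so the gain is read directly as the estimated treatment effect.
    return pre_mean, post_mean, post_mean - pre_mean


# Hypothetical percentile ranks for four students (not data from this study).
pre = [20, 30, 35, 40]
post = [30, 35, 45, 50]
pre_mean, post_mean, gain = norm_referenced_gain(pre, post)
print(round(pre_mean, 1), round(post_mean, 1), round(gain, 1))
```

Note that the percentiles are converted to NCEs before averaging; averaging the percentiles themselves would not respect the equal interval property on which the gain comparison rests.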

There are four assumptions which should be met if this model
is to yield an unbiased estimate of program impact: (1) the
pretest should not be used to select project participants; (2)
the test must be given at the time(s) of the year when the test was
normed; (3) comparable pretest and posttest forms must be used; and
(4) only those students having both pretest and posttest scores
should be used in the analyses. The present application of this
model at the second, third, and seventh grade levels meets all but
one of the assumptions. Both the kindergarten and first grade data,
however, satisfy all of the requirements. The CTBS/S was normed
only in the spring for the second, third, and seventh grade levels;
subsequently, fall norms were linearly interpolated from the spring
data. To the extent that student learning throughout the year is
not linear, the model may yield a biased estimate of program impact
at the second, third, and seventh grade levels.

The pre- and posttest results expressed in NCEs for kindergarten,
first, second, third, and seventh grade Title I students
were utilized in order to illustrate the gains in achievement
which Title I students enjoy. Figures 2, 3, 4, 5, and 6 present
the actual data for grades K-3 and 7, respectively, of the D.C.
Title I program.* The mean pretest-posttest differences for all
CTBS/S scales presented in these figures are statistically
significant at a confidence level greater than 0.999 (p < 0.001) except
for the Reading scale for seventh grade. This scale (see Figure
6) displays a statistically significant difference at a confidence
level of 0.99 (p < 0.01). The mean differences range from 1.6 on
Reading in grade three to 12.0 on the Total Battery for grade two.
The median of these mean differences is approximately 6.65 across
the five grade levels. The Mathematics mean differences tend to
surpass those on the Reading scale for grades 1, 3, and 7, but not
for the second grade. Using a rule of thumb applied by Resource
Management Corporation, exemplary gains are denoted by mean pre to post
differences of 7.0 NCEs or more. Hence, using at least the Total
Battery scales, exemplary gains have been shown in grades 1-3 and
in the prereading component in kindergarten. In the mathematics
component of seventh grade, the gain can also be called exemplary.

* Each scale on the CTBS/S, including total scales, is standardized
and normed separately. Hence, the total battery score is normed by
taking the score derived from all the items on the CTBS/S and not
by forming a linear composite of the three skill areas' total scale
scores.

Expressed relative to the scale standard deviations, the pre- and
posttest differences show their significance even more sharply.
Thirteen of the eighteen scales represented in Figures
2 through 6 have mean differences which are at least one-third
as large as the corresponding standard deviations. The Visual and
Auditory Discrimination scale in kindergarten, the Reading and
Language scales at first grade, and the Reading scales at the
third and seventh grades do not have mean differences which are
at least one-third as large as the scale standard deviations.
On nine of the eighteen scales, the mean differences are at least
half the size of the corresponding standard deviations. The Total
Battery mean differences at the first and second grade levels
particularly illustrate this point. All of these results for the
five grade levels lend firm support to the contention that
treatment effect is distinctly visible.
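
The comparison of a mean difference with its scale standard deviation, as used above, amounts to a simple ratio. The sketch below is illustrative only: the function names are hypothetical and the numbers in the example are invented, not taken from Figures 2 through 6.

```python
def standardized_difference(pre_mean, post_mean, scale_sd):
    """Mean pre-to-post difference expressed in standard deviation units."""
    if scale_sd <= 0:
        raise ValueError("scale_sd must be positive")
    return (post_mean - pre_mean) / scale_sd


def describe_gain(pre_mean, post_mean, scale_sd):
    """Flag whether a gain reaches one-third and one-half of the scale SD."""
    ratio = standardized_difference(pre_mean, post_mean, scale_sd)
    return {
        "ratio": ratio,
        "at_least_one_third": ratio >= 1.0 / 3.0,
        "at_least_one_half": ratio >= 0.5,
    }


# Hypothetical scale: mean NCE rises from 40 to 48 with a scale SD of 20.
print(describe_gain(40.0, 48.0, 20.0))
```

In this invented example the gain is 0.4 standard deviations, which clears the one-third threshold but not the one-half threshold.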

As noted earlier, the data for grades two and three violate
one of the assumptions of the norm-referenced model. However, the
kindergarten and first grade data, which do satisfy all of the
requirements of this model, reflect statistically significant
differences between pre- and posttest means for all of the CTBS/S
scales. A similar result is documented by the second and third
grade data, although the results at these levels are somewhat more
substantial than those at the kindergarten and first grade. Thus,
it is possible that the violation of the one assumption at the upper
grade levels does not seriously and adversely affect the inferences
which may be drawn from the second, third, and seventh grade data.



Control Group Model Results 

The second design implemented in this evaluation is
called the control group model. As its name suggests, this
evaluation design calls for the construction of control and treatment
groups, both selected at random from an initial population of
eligible Title I students. The initial population should be as
similar as possible with respect to all educationally relevant
characteristics, such as age, sex, race, ethnicity, socioeconomic
status, and measured pretreatment achievement levels. After
assignment to the treatment or control group, each student is
taught and treated equally, the single exception being the
application of the Title I program services to those students in the
treatment group. The observed posttreatment effect is derived
from the actual average performance of the treatment group. The
expected no-treatment effect is represented by the measured
average performance of the control group.

In the present application of the control group model, raw
scores on criterion-referenced reading and mathematics tests are
compared. NCE gains can be derived algebraically by dividing
the difference between the treatment group's posttest raw score
mean and the no-treatment expectation by the standard deviation of
the national sample and subsequently multiplying by 21.06. Through
this procedure raw score gains can be converted to NCE gains.
Unfortunately, there is no national sample standard deviation for the
PRT and PMT, and it becomes necessary to make the following
assumption: the ratio of the treatment group's standard deviation to the
standard deviation of the national sample on the norm-referenced
test is equal to the ratio of the treatment group's standard
deviation to the national sample's standard deviation on the PMT
and PRT. That is,

$$\frac{S_{NRT}}{\sigma_{NRT}} = \frac{S_{CRT}}{\sigma_{CRT}}$$

where "S" represents the treatment group's standard deviation and
"σ" represents the national sample's standard deviation. Since the
two treatment group standard deviations can be calculated from the
collected data and the standard deviation of the national sample
on the normed test can be obtained from that test's technical manual,
the estimated national sample's standard deviation on the PMT and
PRT can easily be derived (Tallmadge and Wood, 1976). Under
the above assumption, the raw score gains have been converted to
NCE gains to permit comparisons between treatment effect estimates
yielded by the norm-referenced model and control group model,
respectively.
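
The conversion just described can be sketched as two small steps: solve the assumed ratio for the national standard deviation on the criterion-referenced test, then rescale the raw score difference to NCE units. The function names and all numbers in the example are hypothetical and are given only to show the arithmetic.

```python
NCE_SD = 21.06  # standard deviation of the NCE scale, as stated in the text


def estimate_national_sd_crt(s_nrt, sigma_nrt, s_crt):
    """Solve S_NRT / sigma_NRT = S_CRT / sigma_CRT for sigma_CRT.

    s_nrt     treatment group SD on the norm-referenced test
    sigma_nrt national sample SD on the norm-referenced test
    s_crt     treatment group SD on the criterion-referenced test
    """
    if s_nrt <= 0 or sigma_nrt <= 0 or s_crt <= 0:
        raise ValueError("standard deviations must be positive")
    return s_crt * sigma_nrt / s_nrt


def raw_gain_to_nce(post_mean, no_treatment_expectation, sigma_crt):
    """Convert a raw score treatment effect to NCE units."""
    return (post_mean - no_treatment_expectation) / sigma_crt * NCE_SD


# Hypothetical values, not taken from the tables in this paper.
sigma_crt = estimate_national_sd_crt(s_nrt=16.0, sigma_nrt=20.0, s_crt=4.0)
effect_nce = raw_gain_to_nce(post_mean=23.5,
                             no_treatment_expectation=23.1,
                             sigma_crt=sigma_crt)
print(round(sigma_crt, 2), round(effect_nce, 2))
```

The estimate is only as good as the equal-ratio assumption: if the treatment group is relatively more homogeneous on one test than on the other, the resulting NCE effects will be scaled incorrectly.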

Title I schools are selected according to a weighted index
comprising the total number and percentage of economically
disadvantaged students as indicated by eligibility for free lunch and
low family income. The control group model contrasts thirteen
schools immediately below the cutoff with thirteen schools
immediately above the cutoff. The rationale for the model is this:
among the schools near the cutoff, it is largely chance which
determines eligibility for Title I services. In other words, the
schools immediately above and below the line do not substantially
differ on educationally relevant variables. Thus, those schools
not receiving Title I services can fairly act as a control group
for those schools which operate Title I programs. This is because
students in non-Title I schools, even though their achievement
levels might indicate a need for supplementary aid, do not receive
any Title I services. Tables 1 through 6 give the pretest, posttest,
no-treatment expectation, and treatment effect on the Prescriptive
Reading Test and Prescriptive Mathematics Test for
non-Title I and Title I schools for first, second, and third grades.*

The means for Title I and non-Title I third graders were not
statistically significantly different. Many of the students in
non-Title I schools this year participated in the Title I program
last year; they are enrolled in schools which are not eligible
for Title I funds this year but were eligible last year. Hence,
some of these students actually received supplementary services last
year. If the Title I program was effective in the 1974-75 school
year, then the current second and third grade students in non-Title
I schools near the cutoff might be expected to display higher scores,
as a group, than they would have had their schools not received
Title I services in the previous school year. In other words, the
treatment effects of the Title I program in the 1974-75 school
year would continue to influence the achievement scores of those
students who had been in the treatment group that year. This effect
is sometimes called statistical contamination: the non-Title I
second and third grade students in the current school year are
not free from the influence of the previous year's Title I program.

Table 1

* Seventh grade students were not included in the control group
model analyses because the appropriate control schools were not
designated in time to be included in the comparisons.

In this table "Non-Title I" refers to students in non-Title I schools, whereas elsewhere the term refers to ineligible students
within eligible schools.

The no-treatment expectation is an estimate of the posttest
score that Title I students would have attained had they not
participated in the program. The no-treatment expectation was
determined by adjusting the observed posttest for differences in
pretest scores between students in Title I and non-Title I schools.
An examination of pretest differences between students in Title I
and non-Title I schools revealed that this adjustment was important,
because students in non-Title I schools consistently outperformed
students in Title I schools on the pretest. A straight comparison
of posttest scores for the two groups would be inappropriate, given
that students in non-Title I schools had an initial advantage.
Because it was expected that students in non-Title I schools would
show an initial advantage, a principal axis adjustment rather than
a covariance adjustment was employed (see Kenny, 1975).
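
The text does not spell out the adjustment formula, so the following sketch rests on an assumption: that the principal axis adjustment moves the control group's posttest mean along a line whose slope is the ratio of the posttest to the pretest standard deviation, whereas a covariance adjustment would use the flatter regression slope (that ratio multiplied by the pre-post correlation). The function names, the choice of which group's standard deviations to use, and the numbers are all hypothetical.

```python
def principal_axis_expectation(pre_treat, pre_ctrl, post_ctrl, sd_pre, sd_post):
    """No-treatment posttest expectation under the assumed principal axis slope.

    The control group's posttest mean is shifted by the pretest difference
    between groups, scaled by sd_post / sd_pre.
    """
    if sd_pre <= 0 or sd_post <= 0:
        raise ValueError("standard deviations must be positive")
    slope = sd_post / sd_pre
    return post_ctrl + slope * (pre_treat - pre_ctrl)


def covariance_expectation(pre_treat, pre_ctrl, post_ctrl, sd_pre, sd_post, r):
    """Same adjustment using the regression slope r * sd_post / sd_pre."""
    if sd_pre <= 0 or sd_post <= 0:
        raise ValueError("standard deviations must be positive")
    slope = r * sd_post / sd_pre
    return post_ctrl + slope * (pre_treat - pre_ctrl)


# Hypothetical means and SDs: the treatment group starts one raw point lower.
pa = principal_axis_expectation(7.0, 8.0, 10.0, sd_pre=3.0, sd_post=2.7)
cov = covariance_expectation(7.0, 8.0, 10.0, sd_pre=3.0, sd_post=2.7, r=0.6)
print(round(pa, 2), round(cov, 2))
```

When the treatment group starts lower, the covariance slope removes less of the initial difference and so sets a higher no-treatment expectation than the principal axis slope, which in turn yields a smaller estimated treatment effect.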

On the average, first grade students in non-Title I schools
have slightly higher pretest scores than first grade students in Title I
schools. However, this pattern reverses on the posttest, with
students in Title I schools showing higher Total Reading and Total
Mathematics scores than students in non-Title I schools. This
finding provides a strong argument for treatment effect at the first
grade level. The results for the control group model do not indicate
a Title I program impact at the second grade, although a moderate
level of impact is found at the third grade. One plausible
explanation for the absence of an effect at the upper grade levels is that
some of these students benefited from the Title I program in
previous grades. Given the strong effects at the second and third
grade levels yielded by the norm-referenced model, it seems possible
that the control group model is failing to identify an effect
because the control group is contaminated with last year's
treatment. However, the finding of a moderate effect at third grade
raises doubts about the possibility of a contaminated control
group. Another explanation is that certain assumptions underlying
the present application of the control model are faulty; thus, the
model yields an inaccurate estimate of program effect.

Table 3

                          Pretest Raw Score        Posttest Raw Score       No-Treatment     Treatment Effect
                                                                            Posttest         in Normal Curve
                                                                            Expectation      Equivalents
Subtest                   Title I   Non-Title I    Title I   Non-Title I    Raw Score

Comprehension and
Interpretation
  X                       30.4      27.9           42.4      41.4           43.5             -2.2
  S.D.                     5.8      10.2           10.0       --

Study Reading
  X                       14.7      13.5           1?.8      18.5           19.6             -3.5
  S.D.                     4.7       --             4.5       4.1

Total Reading
  X                       73.2      69.0           93.1      ?3.5           97.3             -2.0
  S.D.                    18.9      18.0           17.6      16.1

("?" marks a character that is illegible in the source; "--" marks an entry that is missing.)

In this table "Non-Title I" refers to students in non-Title I schools, whereas elsewhere the term refers to ineligible
students within eligible schools.

Table 4

Second Grade Pretest, Posttest, and "No-Treatment" Posttest Expectation for Students in Title I Schools (N=452) and
Students in Non-Title I Schools (N=28?) on the Prescriptive Mathematics Test Subtests

                          Pretest Raw Score        Posttest Raw Score       No-Treatment     Treatment Effect
                                                                            Posttest         in Normal Curve
                                                                            Expectation      Equivalents
Subtest                   Title I   Non-Title I    Title I   Non-Title I    Raw Score

Sets and Numbers
  X                       18.1      17.3           23.5      22.3           23.1              1.4
  S.D.                     4.8       5.0            4.8       4.3

Numeration
  X                        --        5.3            7.8       7.?            --              -1.1
  S.D.                     2.0       1.8            2.3       2.2

Operations
  X                       19.0      19.0           20.8      20.1           2?.7              0.3
  S.D.                     5.9       6.0            6.7       7.4

Problem Solving
  X                        3.2       3.1            --        4.6            4.7             -1.2
  S.D.                     1.6       1.?            1.4       1.0

Measurement
  X                        7.0       --             9.7       --             0.3              2.0
  S.D.                     2.7       --             2.?       2.5

Geometric Concepts
  X                        --        --             ?.5       6.1            6.?              2.1
  S.D.                     --        1.7            1.7       1.0

Total Mathematics
  X                       5?.?      ??.4           7?.0       --             ??               0.9
  S.D.                    14.7      13.8           1?.8      1?.7

("?" marks a character that is illegible in the source; "--" marks an entry that is missing.)

In this table "Non-Title I" refers to students in non-Title I schools, whereas elsewhere the term refers to ineligible
students within eligible schools.
Table 5

Third Grade Pretest, Posttest, and "No-Treatment" Posttest Expectation for Students in Title I Schools (N=313)
and Students in Non-Title I Schools (N=98) on the Prescriptive Reading Test Subtests

                          Pretest Raw Score        Posttest Raw Score       No-Treatment     Treatment Effect
                                                                            Posttest         in Normal Curve
                                                                            Expectation      Equivalents
Subtest                   Title I   Non-Title I    Title I   Non-Title I    Raw Score

Word Perception
  X                       31.04     31.14          34.33     34.40           --               0.2
  S.D.                     --        5.72           0.85      5.4?

Comprehension and
Interpretation
  X                       28.87     29.76          33.10     33.18           --               1.5
  S.D.                     8.10      7.37           6.81      7.10

Study Reading
  X                       24.33     25.40          23.17     29.66           --               1.2
  S.D.                     6.50      6.10           ?.49      ?.34

Total Reading
  X                       84.24     80.30          95.74     97.23           --               2.3
  S.D.                    15.58     17.00          16.74     16.95

("?" marks a character that is illegible in the source; "--" marks an entry that is missing.)

In this table "Non-Title I" refers to students in non-Title I schools, whereas elsewhere the term refers to ineligible
students within eligible schools.



Table 6

Third Grade Pretest, Posttest, and "No-Treatment" Posttest Expectation for Students in Title I Schools (N=326)
and Students in Non-Title I Schools (N=109) on the Prescriptive Mathematics Test Subtests

                          Pretest Raw Score        Posttest Raw Score       No-Treatment     Treatment Effect
                                                                            Posttest         in Normal Curve
                                                                            Expectation      Equivalents
Subtest                   Title I   Non-Title I    Title I   Non-Title I    Raw Score

Sets and Numbers
  X                       14.02     14.8?          16.42     16.18          16.4              4.4
  S.D.                     3.??      3.23           3.21      3.03

Numeration
  X                        7.27      8.09           9.47     10.02           9.2              1.2
  S.D.                     3.03      3.16           3.12      2.85

Operations
  X                       22.04     24.27          31.73     32.78          20.7              1.9
  S.D.                     7.80      8.33           7.07      6.67

Problem Solving
  X                        5.37      5.91           6.74      6.77           6.4              2.9
  S.D.                     2.41      2.35           1.82      1.80

Measurement
  X                       13.86     14.6?          16.44     16.76          18.0              1.6
  S.D.                     3.88      4.16           3.64      3.36

Geometric Concepts
  X                        1.?6      2.10           2.30      2.25           2.0              3.5
  S.D.                     1.12      1.13           1.20      1.15

Total Mathematics
  X                       64.42     ??.90          ?3.10     ?4.?0          ?3.5              3.0
  S.D.                    16.??     17.02          10.55     14.44

("?" marks a character that is illegible in the source.)

In this table "Non-Title I" refers to students in non-Title I schools, whereas elsewhere the term refers to ineligible
students within eligible schools.













The first grade and third grade results from the control
group model corroborate the findings of the norm-referenced model
and confirm that the Title I program has a statistically and
educationally significant impact on student reading and mathematics
achievement. The fact that two models using different achievement
tests converged on a similar estimate of treatment effect strongly
indicates that the estimate is valid. The fact that the two models
do not converge on a similar estimate of treatment effect at the
second grade, in light of the findings at the other two grades, is
best considered a sampling anomaly. Replication of this analysis
next year should afford additional insight into these second
grade results.



Special Regression Model 

This statistical design, as its name implies, is based on
regression methodology. As with the two previously discussed models,
the method which is used to derive the no-treatment expectation
determines the model which is actually used. The treatment and
comparison groups are selected on the basis of their pretest
scores using a firmly established, strictly enforced cutoff. The
treatment group is then given the benefits of the Title I program.
Both treatment and comparison groups are posttested using an
instrument which correlates highly with the pretest device. The
observed posttreatment performance is actually the treatment
group's average posttest score. The no-treatment expectation is
derived from a projection of the regression line determined by the
comparison group's pre- and posttest scores. The actual treatment
effect is measured at two points, as indicated in Figure 7.

A. At the pretest cutoff score.

B. At the treatment group's average pretest score.



Figure 7

(Labels recoverable from the plot: Treatment Group; Observed posttest
score; Posttest score; Observed Values. Horizontal axis: Pretest Scores.)



The purpose of measuring the treatment effect at these two points
along the projected regression line of the comparison group is to
determine whether the relationship between pre- and posttest was
the same for both the treatment and comparison groups.
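
The projection described above can be sketched as follows. This is an illustration under assumptions, not the evaluators' procedure: an ordinary least squares line is fitted to each group, the effect at the treatment group's average pretest score is the observed mean posttest minus the projected comparison line, and the effect at the cutoff is taken as the gap between the two fitted lines at that score. The function names and the data are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("pretest scores must vary to fit a line")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def special_regression_effects(treat_pre, treat_post, comp_pre, comp_post, cutoff):
    """Return (effect at the cutoff, effect at the treatment mean pretest)."""
    b_c, a_c = fit_line(comp_pre, comp_post)    # comparison group line
    b_t, a_t = fit_line(treat_pre, treat_post)  # treatment group line
    mean_pre = mean(treat_pre)
    # No-treatment expectation: the comparison line projected below the cutoff.
    effect_at_mean = mean(treat_post) - (a_c + b_c * mean_pre)
    effect_at_cutoff = (a_t + b_t * cutoff) - (a_c + b_c * cutoff)
    return effect_at_cutoff, effect_at_mean


# Hypothetical scores: students below a cutoff of 48 receive the program.
comp_pre, comp_post = [50, 55, 60, 65, 70], [52, 57, 62, 67, 72]
treat_pre, treat_post = [30, 35, 40, 45], [37, 42, 47, 52]
print(special_regression_effects(treat_pre, treat_post, comp_pre, comp_post, 48))
```

In this invented example the two lines are parallel, so the effect is the same at both points; unequal effects at the two points would suggest that the pre-post relationship differs between the groups.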

There are five assumptions which should be met if
this model is to yield an unbiased estimate of program impact:

(1) the pretest and the selection test must be the same instrument;

(2) the pretest or posttest should be given at the empirical
normative data point of the instrument; (3) the pretest and posttest
should be highly correlated (r ≥ 0.60); (4) there should be a strict
cutoff score for determination of placement in the treatment group
and the comparison group; and (5) only those students having both
pretest and posttest scores should be used in the analyses.

The model is predicated on the supposition that the no-treatment
expectation of the treatment group can be calculated using the
regression line determined from the comparison group. A strict
cutoff score for determining placement in each of the groups is
necessary. Otherwise, the pretest standard deviation is inflated
and the regression line of the comparison group is flattened. This
will systematically bias the results against showing a positive
treatment effect for Title I programs.

This model was rejected for implementation in the evaluation
after determining that one of these requirements was violated by
the data, a violation which would render the resulting measured
treatment effects questionable if not totally misleading. The
evaluators found that a clearly defined cutoff was not enforced
in every case. Primarily for this reason, the special regression
model was not implemented using the District of Columbia data.
A poorly implemented evaluation model can be not only confusing
but totally misleading and should be avoided for this reason.

Standardized Growth Expectations:
Some Findings and Implications

Most educational evaluations, including the present one,
ignore what may be a critical factor when estimating whether a
program has had an impact or whether students have learned more at
one grade level than another. It is typically assumed that a
treatment effect of seven NCEs (one-third standard deviation) has the
same meaning in first grade as in seventh grade. What is not
typically considered is that the expected growth is different in
first grade and seventh grade. Another way of viewing the issue
is to ask whether a student would lose the same amount (relative to
national norms) in reading achievement if he/she fell asleep for
all of first grade or all of seventh grade. This is the same as
asking: how much growth does the average student make in reading
achievement during first grade, and is it the same as the growth
realized by the average seventh grade student during one school
year?

An answer to the above question can be approximated by
assuming that a student will attain the same raw score on the pretest
and posttest if no learning has taken place. If the pretest raw
score is equivalent to a national percentile of 50 and the same
raw score is entered into the posttest percentile table, the
resulting percentile score will be less than 50. The difference
between the pretest percentile and the posttest percentile expressed
in standard score form is the standardized growth expectation (SGE).
The SGE is the amount that a student learns over a period of time
or, conversely, what the student would lose if he/she fell asleep
and learned nothing. An example may help to clarify the computation
procedures used to calculate SGEs. Table 7 presents a raw score
to percentile conversion for beginning of first grade and end of
first grade on the Total Reading scale of the CTBS/S. The average
(50th percentile) beginning first grade student attains a raw score
of 31 on Total Reading. Under the assumption that this average
student learns nothing in the first grade, he/she would be expected
to obtain again a raw score of 31 on the posttest. Whereas a raw
score of 31 is equivalent to a beginning first grade percentile of
50, it represents an end of first grade percentile of 9. If both
percentiles are converted to NCEs (50 → 50; 9 → 21.8) and subtracted,
the result is an SGE of 29.2. In other words, if an average student
falls asleep and learns nothing during the first grade, he/she
would be expected to lose 29.2 NCEs because that is the amount of
standardized growth exhibited by the national norm group during the
first grade. Yet another way of viewing the SGE is to consider it
as an estimate of the effect of school, home, and social forces
(such as radio and television) on first grade students' reading
achievement.*
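
The SGE computation can be sketched as a table lookup followed by a subtraction in NCE units. This is an illustration only: the function names are hypothetical, the two lookup tables hold just the single raw score quoted in the text, and the percentile-to-NCE conversion assumes the closed form 50 + 21.06z. With that closed form the difference for the 50th and 9th percentiles comes out near 28.2 rather than the 29.2 printed in the text; the sketch does not attempt to resolve that difference.

```python
from statistics import NormalDist

NCE_SD = 21.06  # standard deviation of the NCE scale, as stated in the text

# Raw score -> percentile entries quoted in the text for Total Reading.
BEGINNING_OF_GRADE = {31: 50}
END_OF_GRADE = {31: 9}


def percentile_to_nce(percentile):
    """Assumed conversion: normal deviate rescaled to mean 50, SD 21.06."""
    return 50.0 + NCE_SD * NormalDist().inv_cdf(percentile / 100.0)


def standardized_growth_expectation(raw_score, pre_table, post_table):
    """NCE loss if the same raw score is earned at pretest and posttest."""
    pre_nce = percentile_to_nce(pre_table[raw_score])
    post_nce = percentile_to_nce(post_table[raw_score])
    return pre_nce - post_nce


sge = standardized_growth_expectation(31, BEGINNING_OF_GRADE, END_OF_GRADE)
print(round(sge, 1))
```

A fuller version would load the complete norms tables for each grade and repeat the lookup at the 50th percentile entry point.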

* The SGE differs slightly depending upon where in the pretest
distribution the raw score is selected to be entered into the posttest
percentile distribution. For ease of presentation, this difference
is ignored since it does not influence the general conclusions.

Table 7

Raw Score to Percentile Table for Beginning and End of First Grade on CTBS/S, Level B,
Total Reading

        Beginning of First Grade           End of First Grade
        Raw Score     Percentile           Raw Score     Percentile

        73-84         99                   84            99
        68-72         98                   84            98
        65-67         97                   84            97
        61-64         96                   84            96
        59-60         95                   84            95
        57-58         94                   83            94
        55-56         93                   83            93
        53-54         92                   82            92
        52            91                   82            91

        31            50                   59            50
        31            49                   58            49
        31            48                   58            48
        31            47                   57            47
        31            46                   5?            46
        30            45                   5?            45
        30            44                   54            44
        30            43                   53            43
        30            42                   53            42
        30            41                   52            41

        20            10                   32            10
        1?             9                   31             9
        18             8                   30             8
        18             7                   29             7
        18             6                   28             6
        1?             5                   27             5
        17             4                   25-26          4
        16             3                   24             3
        1?             2                   21-23          2
        0-1?           1                   0-2?           1

("?" marks a character that is illegible in the source.)
Table 8 presents SGEs in reading and mathematics for grades 
1, 2, 3, and 7. To facilitate comparison of information from the 
findings of this section with the rest of the chapter, all SGEs are 
presented as normal curve equivalents (mean = 50; S.D. = 21.06). All 
SGEs in Table 8 are computed from the norms in the publisher's 
manual for the CTBS/S and CTBS/T (for seventh grade). The procedure 
used to compute SGEs is identical to that described in the 
paragraph above. 



Table 8 

Standardized Growth Expectations (SGEs) Expressed as NCEs in Total Reading and 
Total Mathematics for Grades 1, 2, 3, and 7 

                      Grade 1    Grade 2    Grade 3        Grade 7 

Total Reading          29.2        9.3      [illegible]      3.7 

Total Mathematics      21.8       12.3      11.?             3.2 

The period of growth in each case is fall to spring of the school year. The 50th percentile point 
was used for entry into the pretest tables. 



Table 8 reveals that the SGE for second grade Total Reading 
is only one-third of the SGE at the first grade level. Similarly, 
the second grade SGE for Total Mathematics is about four times larger 
than the SGE for grade seven. This seems to indicate that the rate 
of growth is different from grade to grade and, in particular, the 
rate slows with each additional year of schooling. The SGEs for 
height and weight computed from birth to eighteen years of age 
follow a similar pattern of deceleration. The largest SGEs appear 
during the first few years of life and gradually diminish until 
eighteen years of age, when the SGE is less than one NCE point. Table 
9 illustrates this phenomenon for weight, viewing ages four to 
eighteen. 



Table 9 

Means, Standard Deviations, and SGEs 
for Weight Expressed in Pounds at Various Age Levels 

AGE            MEAN Weight    STANDARD 
(In Years)     (Pounds)       DEVIATION      SGE 

 4              35.7            4.2          19.6 
 5              40.3            4.9          17.1 
 6              45.3            5.9          16.7 
 7              50.8            6.8          13.5 
 8              56.6            8.2          14.2 
 9              62.6           10.1          13.5 
10              68.9           12.3          13.5 
11              75.9           16.2          15.6 
12              86.2           20.0          14.9 
13              99.2           22.5          14.9 
14             113.2           24.2          13.5 
15             126.5           25.1          11.7 
16             137.7           25.2           8.1 
17             [illegible]     24.6           2.1 
18             147.2           23.9           -- 

Tables 10 and 11 further depict the differential growth rate 
across development in the cognitive domain. The various 
inconsistencies which exist across several tests in measuring the same 
cognitive construct are illustrated in Table 10. Notice the disparate 
rates of decline on each scale for the SGE using the ITBS and the 
CTBS. Also, consider the different SGEs on the CTBS alone, 
depending on whether national or big city norms tables are used. 
Apparently, something happens to the normative population* at the end 
of third grade which causes the mean achievement scores for each 
percentile group during the successive year to increase, thereby 
inflating the SGE. This is probably caused by an unusual 
proportion of the lower achieving students somehow being excluded from 
the normative population after third grade: when a large percentage 
of low scorers drop out of the norms, then the average score at all 
percentile levels increases. This occurs again at the end of 
grade nine and is most likely due to the large number of low 
scoring dropouts exiting the educational system. 
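
The mechanism proposed above can be illustrated with a toy calculation. The sketch below is not based on any publisher's norms: the raw scores 1 through 100 are a made-up norm group, the helper names are hypothetical, and the "dropout" is simulated by removing the ten lowest scorers. It shows only that removing low scorers raises the raw score attached to every percentile, so a student whose score is unchanged earns a lower percentile on the later norms.

```python
def score_at_percentile(scores, pct):
    """Smallest score with at least pct percent of the group at or below it."""
    ordered = sorted(scores)
    n = len(ordered)
    rank = (pct * n + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[max(rank, 1) - 1]


def percentile_rank(scores, value):
    """Percent of the group scoring at or below value."""
    return 100.0 * sum(s <= value for s in scores) / len(scores)


# Hypothetical norm group: one student at each raw score from 1 to 100.
full_norm_group = list(range(1, 101))
# Simulated dropout effect: the ten lowest scorers leave the norm group.
after_dropout = [s for s in full_norm_group if s > 10]

for pct in (25, 50, 75):
    before = score_at_percentile(full_norm_group, pct)
    after = score_at_percentile(after_dropout, pct)
    print(f"{pct}th percentile: raw score {before} before, {after} after")

# A raw score of 50 was the median of the full group; on the trimmed
# norms the same score falls below the median.
print(round(percentile_rank(after_dropout, 50), 1))
```

In this toy case the raw score at the 50th percentile moves from 50 to 55, so a student who learned nothing would appear to lose more ground against the later norms, which is the inflation of the SGE described in the text.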

The "dropout effect" discussed above does not seem to occur 
in the area of cognitive ability at either grade four or ten. 
Rather, the SGE steadily declines, year by year, for the verbal 
area. Quantitative ability behaves similarly, but the SGE hops 
up noticeably in the eleventh grade. This may also be a reflection 
of the "dropout effect" as evidenced with the achievement 
instruments. Although nonverbal IQ for one test (IPAT) ... 
decline, its evanescence on the other test is not so consistent. 
This discrepancy is due either to the differences in item content between the 
two tests, cohort effects, or differences in norming samples. 

Table 11 displays the decline in SGEs for ability as measured 
by the Wechsler Adult Intelligence Scales (WAIS) from ages sixteen 
to seventy-four. Notice that the standard deviation for the norm 
group on all three measures of ability (verbal, performance, full 
scale) remains relatively stable across this vast age range. 
Also, it is easy to recognize the effects of aging and senility 
upon the tested ability level of the senior citizens group 
(sixty-five and older): the SGE loss suddenly doubles (triples in the 
case of verbal ability) after age sixty-four. Interestingly, the 
SGE loss for performance ability is nearly equally substantial 
between ages twenty-five and fifty-four. 
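
Table 11 reports SGEs in standard deviation units, whereas the other tables use NCEs. Because the NCE scale has a standard deviation of 21.06, a difference expressed in z-score units can be put on the NCE scale by multiplying by 21.06. The short sketch below makes that conversion explicit; the function name is hypothetical, and the conversion assumes the z scores refer to the same normal-curve metric that underlies the NCE.

```python
NCE_SD = 21.06  # standard deviation of the NCE scale, as given in the paper


def z_units_to_nce_units(sge_in_z: float) -> float:
    """Rescale an SGE expressed in standard deviation units to NCE units."""
    return sge_in_z * NCE_SD


# Verbal IQ SGEs from Table 11 for two adjacent age groups.
for age_group, sge_z in (("55-64", -0.13), ("65-69", -0.40)):
    print(age_group, round(z_units_to_nce_units(sge_z), 1))
```

On this scale the verbal SGE loss moves from under 3 NCEs for ages 55-64 to over 8 NCEs for ages 65-69, which is the tripling noted in the text.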

Table 12 introduces two new variables into the investigation 
of developmental SGE decline. If the amount of growth is calculated 
both for different percentile ranks and for the 
regular school year and summer, yet another type of differential 
growth is revealed. The Metropolitan was empirically normed in 
both the fall and spring, thus permitting a comparison between 
achievement during the summer and that during the regular school 
year. Summer achievement gains approximate or surpass those made 
during the regular school year for the upper percentile students. 
The fact that 25th percentile students seem to keep pace during the 
school year but lag behind during the summer has some interesting 
implications for compensatory education programs. 

*The possibility of cohort effects must also be considered 
(Baltes and Schaie ...). 



Table 10 



SGEs for Several Achievement/Ability Subtests 
Expressed as Normal Curve Equivalents (mean = 50; S.D. = 21.06) 

Ages at Successive Grade Levels 



Standardized 
Achievement Tests 

Age: 
Grade: 

Vocabulary 
  ITBS 
  CTBS: National Norms 
  CTBS: Big City Norms 

Reading Comp. 
  ITBS 
  CTBS: National Norms 
  CTBS: Big City Norms 

Total Language 
  [illegible] 
  CTBS: National Norms 
  CTBS: Big City Norms 

Total Math 
  CTBS: National Norms 
  CTBS: Big City Norms 

Cognitive 
Abilities Test 
  Verbal 
  Quantitative 
  Nonverbal 

IPAT Culture 
Fair Test 


10 

4 



12 
6 



13 



14 

■8 



13. S II 12.3 11.7 g.7 -6.4 

21.S IS. 5 -10.1 .10.4 . 7,0 ej 6.4 
(14J)^ [7.0) (E.3) (6.4) (1.1) 



29.6 22.2 14.9 
(13.1)- 



27.0 11.0 
1S.6 24.7 14.2 
(14.§) 



14.7 13..S 
27.0 -' 24.7 20.9 

(13.S) 



1§.3 
5,9 



13.E -"ISJ ^12.3'*'X7"" 'S.y 
■5.1 a.l 4.8 1.3 U 
(6.4) (4.0) (1.3) (4.8) (1.4) 



IS. 6 n.7 ■ 10.4 ail S.0 
8.7 .8.7 '4.2 4. a 4.3 
(6.4) JS.3); (S.3) (S.9) . (1.6) 

1§.3 17.0 14.3 nU §.1 
11.0 , -12.9' 7.-S ".6;4 "*6.4 
■(^.4) (7.S) ill) (5.3) (2.6) 



13,S ,12.9; ■l0.4' 7.S ■ '7.0 
15.S 10.4 ■ 8.9\-7,0 ■■7. 3 
. 8.1 7.0 4.2 hi 3.7 



IS 



3.7 
3.7 



2.6' 
6.4 



7.0 
6.4 
S.9 



16 17 
10 11. 



3.7 3.2 

7.0 ' S.3 



2.6 1.1 
7.0 3.S 



U U 
3.7 10.4 
4.8" 3.2 
4.8. 1.3 S.9 U 



3.2 
3.? 



1.4 7.S 4.8 4.2 



2,1 
2.6- 



(Nonverbal IQ) 



8.4 11.7 9.S 6.8 ' 6.8 6.8 



Table 11 



SGEs Expressed in Standard Deviation Units (Z Scores) for the 
Wechsler Adult Intelligence Scales (WAIS) 

             VERBAL IQ                  PERFORMANCE IQ             FULL SCALE IQ 
AGE GROUP    SGE          S.D. of       SGE          S.D. of       SGE      S.D. of 
                          Norm Group                 Norm Group             Norm Group 

16-17        [illegible]  [illegible]   [illegible]  [illegible]   +0.13    25.2 
18-19        +0.13        14.9          +0.07        11.8          +0.13    25.7 
20-24        +0.13        15.2          -0.13        12.0          +0.00    24.8 
25-34        -0.07        14.6          -0.30        11.8          -0.13    24.8 
35-44        -0.13        14.9          -0.40        11.3          -0.26    26.2 
45-54        -0.13        16.2          -0.33        11.3          -0.26    25.8 
55-64        -0.13        16.4          -0.20        10.8          -0.20    -- 
65-69        -0.40        --            -0.45        --            -0.40    -- 
70-74        -0.26        --            -0.45        --            -0.40    -- 





Table 12 



SGEs Expressed as NCEs (mean = 50; S.D. = 21.06) 
for Metropolitan Achievement Test (MAT) 
Total Reading and Total Mathematics for the 
25th, 50th, and 75th Percentiles 

                  READING                    MATHEMATICS 
GRADE LEVEL       25th    50th    75th       25th    50th    75th 

Summer            11.6    11.0    13.1        9.5     8.7    10.0 
2                 16.9    17.7    10.5        9.6    16.3     8.3 
Summer 



As pointed out earlier, Title I evaluations assume that an 
NCE gain, or treatment effect, of seven points means the same thing 
if it occurs in the first grade or the seventh grade. The 
assumption is that it is just as difficult to improve a first 
grade treatment group by seven NCEs as it is to improve a 
seventh grade treatment group by seven NCEs. An examination 
of Table 8 suggests that a gain of seven NCEs in Total 
Mathematics at the seventh grade level represents a 200 percent 
increase in achievement rate, whereas the same seven point gain 
at the first grade represents a 33 percent increase in achievement 
rate. The question arises as to whether achievement rate 
expressed in SGEs must be considered in interpreting treatment 
effect. If all the impact of school, community, home, and 
social forces can only cause a Total Reading SGE of 3.7 NCEs 
for the average seventh grader nationally, then is it fair to 
expect a Title I program to show a treatment effect of seven 
NCEs above and beyond the SGE of 3.7? Perhaps the ratio of 
treatment effect to SGE would provide a more comparable index 
across grades, tests, and subtests. When the SGE is considered, 
a number of difficult questions arise regarding the meaning of 
treatment effect and the wisdom of aggregating across either grades 
or tests. At the present time not enough is known to judge the 
value of the SGE as a statistic for communicating treatment effects 
in a comparable unit. A special report is forthcoming that will 
present this concept in fuller detail and, hopefully, discuss 
the contributions, if any, that the SGE promises to make to 
evaluation methodology. 
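
The ratio suggested above is simple to compute. The sketch below applies it to the Total Mathematics SGEs for grades 1 and 7 from Table 8 and a seven-NCE treatment effect; the function name is hypothetical and the index is only the ratio floated in the text, not an established statistic.

```python
def effect_to_sge_ratio(treatment_effect: float, sge: float) -> float:
    """Treatment effect expressed as a fraction of the expected growth (SGE)."""
    return treatment_effect / sge


# Total Mathematics SGEs (in NCEs) for grades 1 and 7, from Table 8.
TOTAL_MATH_SGE = {1: 21.8, 7: 3.2}
GAIN = 7.0  # the seven-NCE treatment effect used as the example in the text

for grade, grade_sge in TOTAL_MATH_SGE.items():
    ratio = effect_to_sge_ratio(GAIN, grade_sge)
    print(f"Grade {grade}: gain is {ratio:.0%} of the expected growth")
```

The two ratios come out near one-third and a little over two, in line with the approximate 33 percent and 200 percent figures cited in the text.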



Summary 



Table 13 summarizes the treatment effect in Total Reading 
and Total Mathematics as documented by the norm-referenced and 
control group models. It is interesting to note that the two 
models yield similar estimates of treatment effect for first 
grade reading and mathematics achievement, but substantial 
differences in treatment effect are evident at second and third grade. 
The fact that all assumptions of both the norm-referenced and 
control group models were met at the first grade level generated 
confidence in the accuracy of these estimates of treatment effect. 
The widely divergent estimates for second grade and the moderately 
similar estimates at third grade suggest that both models may 
be highly sensitive to the types of assumption violations which 
are encountered in typical applications of these two models. It 
seems plausible that when low achievers in a class are 
removed for one to two hours per day of special laboratory 
instruction, the remaining group can progress at a faster pace. 
This is because the student/teacher ratio is lowered and because 
the group is academically more homogeneous. Also, the likelihood 
is that equipment, materials, and teacher inservice (bought by 
Title I funds) benefit all students in the class. It becomes 
Table 13 

Summary of Treatment Effects (in NCEs) at Grades 1, 2, and 3 for Total Reading and 
Total Mathematics 

                               Total Reading          Total Mathematics 

GRADE                          1      2      3        1      2      3 

Norm-Referenced 
Model Estimates                3.9    9.9    3.1      6.2    8.1    7.9 
of Treatment 

Control Group 
Model Estimates                5.5   -2.0    2.3      6.7    0.9    3.0 
of Treatment 

Average Effect                 4.7    6.0    2.6      6.5    4.5    5.5 



clear that non-Title I students are receiving a "treatment" by 
virtue of the fact that Title I students obtain project services. 
Although the full magnitude of this unexpected, positive outcome 
is not yet known, this year's findings confirm that non-Title I 
students are achieving much better than would be expected from 
their pretest scores. A fuller treatment of the results 
supporting this finding will be presented in a special report. In 
addition, the issue of differential growth as measured by SGEs 
across various grade levels definitely merits serious study. 

One puzzlement continues and should be investigated further. 
Why do Title I students do so well in the D.C. Title I program 
and then lose a major portion of their newfound advantage over 
the summer? This phenomenon is nationwide and should not be 
considered an anomaly of the D.C. program. However, its 
widespread appearance does not reduce local responsibility for 
finding an explanation. 